Developing Deployable Spoken Language Translation Systems given Limited Resources

Author

  • Matthias Eck
Abstract

Automatic machine translation systems in research institutions have reached a considerable level of performance. This is especially true for Statistical Machine Translation. Since its introduction in the 1990s, it has outperformed earlier approaches and produces a translation quality that seemed impossible only a short time ago. Particularly for limited domains like tourism dialogs, medical relief or force protection, the translations could already be very useful for real applications outside of laboratories or demonstrations.

The question here is no longer how to get the technology to an acceptable level, but how the existing technology can be globally deployed. How can a system be developed that tourists, health professionals or soldiers can actually use and benefit from? The following lists the issues that need to be addressed in order to make a current research system deployable to real users:

               Research system               Deployable system
  Languages    Limited number                Many Language Pairs
  Hardware     High-End Server               Notebook, Mobile Device
  Application  Evaluations, Demonstrations   Actual Communication
  Evaluation   BLEU/NIST Scores              Communication success
  Users        Expert researchers            Tourists, Medical users, Military users
  Interface    Complex, command line         Easy, interactive

The first difference between a research system and a deployable system is the number of generally supported languages. In research, most groups focus on a few language pairs for which large training corpora are available. Systems for tourists, and especially for medical relief or military uses, will have to support a much larger number of languages. It is especially necessary to rapidly support specific language pairs if a sudden demand develops.

In research labs the translation systems are usually run on high-powered, expensive servers to be able to use larger and more complicated models and to realize even small performance improvements. The users are researchers and

Related Resources

Machine Translation for Government Applications

Thousands of languages are spoken in the world. The commercial sector provides significant translation capabilities for a few dozen of these, both in the form of trained professionals and in the form of machine translation (MT) systems. However, the U.S. government has many unmet foreign-language requirements that are likely to remain unmet by the commercial sector. Limited resources have affec...

Theoretical and methodological issues regarding the use of Language Technologies for patients with limited English proficiency

This paper concerns the use of spoken language translation as well as other technologies to support communication between clinicians and patients where the latter have limited proficiency in the majority language. The paper explores some theoretical and methodological issues, in particular the question of whether it is the patient or clinician who should be seen as the primary user of such soft...

Language-independent hybrid MT with PRESEMT

The present article provides a comprehensive review of the work carried out on developing PRESEMT, a hybrid language-independent machine translation (MT) methodology. This methodology has been designed to facilitate rapid creation of MT systems for unconstrained language pairs, setting the lowest possible requirements on specialised resources and tools. Given the limited availability of resourc...

Efficient Language Model Construction for Spoken Dialog Systems by Inducting Language Resources of Different Languages

Since the quality of the language model directly affects the performance of the spoken dialog system (SDS), we should use a statistical language model (LM) trained with a large amount of data that is matched to the task domain. When porting an SDS to another language, however, it is costly to re-collect a large amount of user utterances in the target language. We thus use the language resources...

Computing Consensus Translation from Multiple Machine Translation Systems

In this paper, we address the problem of computing a consensus translation given the outputs from a set of Machine Translation (MT) systems. The translations from the MT systems are aligned with a multiple string alignment algorithm and the consensus translation is then computed. We describe the multiple string alignment algorithm and the consensus MT hypothesis computation. We report on the su...

Publication date: 2008